2. A MODEL FOR DISTANCE DEPENDENCE

Assume the parameter to be estimated is $\mu$, and a single sample point
will have variance $\sigma^2$ and mean $\mu$. Assume further that samples are
nonnegatively correlated with each other as a result of being
geographically close. Assuming details of geography are not known,
an appropriate model may be derived from the following assumptions:


(1) The correlation $\rho_{ij}$ between the sample values $X_i$, $X_j$ depends
only on the Euclidean distance $\Delta_{ij}$ between the locations of
the samples.

(2) If the sample points i, j, k are collinear with j between i and k,
then the correlation between $X_i$ and $X_k$ depends only on the
correlations between $X_i$ and $X_j$ and between $X_j$ and $X_k$. Mathematically,
this says that the partial correlation $\rho_{ik \cdot j} = 0$.


Assumption two simply means that the effect of one point on its neighbors 
is via its effect on intermediate points, a sort of "domino" action. 

From the definition of partial correlation (e.g. Tinui (1975)) it is an
immediate consequence that $\rho_{ik} = \rho_{ij}\,\rho_{jk}$.
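The step can be made explicit with the standard first-order partial correlation formula, written here for the three collinear points i, j, k. It presumes $|\rho_{ij}| < 1$ and $|\rho_{jk}| < 1$, so that the denominator is nonzero.

```latex
\rho_{ik \cdot j}
  = \frac{\rho_{ik} - \rho_{ij}\,\rho_{jk}}
         {\sqrt{(1 - \rho_{ij}^2)(1 - \rho_{jk}^2)}},
\qquad\text{hence}\qquad
\rho_{ik \cdot j} = 0 \iff \rho_{ik} = \rho_{ij}\,\rho_{jk}.
```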


As a result we have 


Theorem: If (1) and (2) hold, then $\rho_{ik} = e^{-K\Delta_{ik}}$ for all pairs of sample
locations i and k.
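A sketch of why the exponential form arises is given below. It assumes, beyond what the text states, that the correlation is a continuous function f of distance with f(0) = 1, and that every pair of distances can be realized by some collinear triple of locations.

```latex
\begin{aligned}
&\text{By (1): } \rho_{ik} = f(\Delta_{ik}), \qquad
 \text{and for collinear } i, j, k:\ \Delta_{ik} = \Delta_{ij} + \Delta_{jk}. \\
&\text{By (2): } f(a + b) = f(a)\, f(b) \quad \text{for all } a, b \ge 0. \\
&\text{The continuous solutions with } 0 < f \le 1 \text{ are }
 f(d) = e^{-K d}, \quad K \ge 0.
\end{aligned}
```

The restriction K >= 0 reflects the requirement that correlations be nonnegative and no larger than one.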


Fitting the model thus requires that we estimate the parameter K. 





3. MINIMUM VARIANCE UNBIASED ESTIMATION 
UNDER DISTANCE DEPENDENCE 

The sample $X_1, \ldots, X_n$ has covariance matrix $\sigma^2 C$ where $C = (e^{-K\Delta_{ij}})$.

We wish to find a vector t so that t'X is the minimum variance linear
unbiased estimator for $\mu$. The variance of t'X is then given by
$\sigma^2\, t'Ct$; the unbiasedness constraint is $t'\mathbf{1} = 1$, where $\mathbf{1}$ is the n x 1 vector
of ones. Minimizing the variance and introducing the constraint via
Lagrange multipliers we get


$$t = \frac{C^{-1}\mathbf{1}}{\mathbf{1}'C^{-1}\mathbf{1}}.$$
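The Lagrange-multiplier step can be written out as follows. This is a sketch that assumes C is positive definite (distinct locations and K > 0) and drops the constant factor $\sigma^2$, which does not affect the minimizer; the multiplier is written $2\lambda$ purely for convenience.

```latex
\begin{aligned}
L(t, \lambda) &= t'Ct - 2\lambda\,(t'\mathbf{1} - 1), \\
\frac{\partial L}{\partial t} &= 2Ct - 2\lambda\mathbf{1} = 0
  \;\Longrightarrow\; t = \lambda\, C^{-1}\mathbf{1}, \\
t'\mathbf{1} = 1 &\;\Longrightarrow\;
  \lambda = \frac{1}{\mathbf{1}'C^{-1}\mathbf{1}}, \\
t'Ct &= \lambda^2\, \mathbf{1}'C^{-1}\mathbf{1}
      = \frac{1}{\mathbf{1}'C^{-1}\mathbf{1}}.
\end{aligned}
```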


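The variance of the estimator t'X is then

$$\mathrm{Var}(t'X) = \frac{\sigma^2}{\mathbf{1}'C^{-1}\mathbf{1}}.$$

Writing $c^{ij}$ for the elements of $C^{-1}$, we have
$\sum_{i,j} c^{ij} = \mathbf{1}'C^{-1}\mathbf{1}$ since it is simply the sum of all terms in
the matrix. Then

$$\mathrm{Var}(t'X) = \frac{\sigma^2}{\sum_{i,j} c^{ij}}.$$

By contrast, the variance of the sample mean $\bar{X}$ is

$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n^2} \sum_{i,j} c_{ij},$$

where $c_{ij}$ are the elements of $C$. Thus the reduction in variance for our
estimator compared to the sample mean is

$$\frac{\mathrm{Var}(t'X)}{\mathrm{Var}(\bar{X})} = \frac{n^2}{\sum_{i,j} c^{ij} \, \sum_{i,j} c_{ij}}.$$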
This ratio is never greater than one, since the sample mean is a
linear unbiased estimator, and our procedure has minimum variance in
this class.
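The weights and the variance ratio can be computed numerically for any set of locations. The following is a minimal sketch using only the Python standard library, assuming the exponential correlation model with a known K and distinct sample locations; the function names are illustrative and do not come from the text.

```python
import math


def corr_matrix(points, K):
    """C[i][j] = exp(-K * Euclidean distance between points i and j)."""
    return [[math.exp(-K * math.dist(p, q)) for q in points] for p in points]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def blue_weights(points, K):
    """Weights t = C^{-1} 1 / (1' C^{-1} 1)."""
    C = corr_matrix(points, K)
    u = solve(C, [1.0] * len(points))  # u = C^{-1} 1
    total = sum(u)                     # 1' C^{-1} 1
    return [v / total for v in u]


def variance_ratio(points, K):
    """Var(t'X) / Var(sample mean) = n^2 / ((1' C^{-1} 1) * (1' C 1))."""
    C = corr_matrix(points, K)
    n = len(points)
    sum_inv = sum(solve(C, [1.0] * n))   # sum of all elements of C^{-1}
    sum_c = sum(sum(row) for row in C)   # sum of all elements of C
    return n * n / (sum_inv * sum_c)
```

Solving the linear system C u = 1 avoids forming the inverse of C explicitly, since only the vector C^{-1} 1 and its sum are needed.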


To illustrate, take the simple case where the sample points 1, 2, and
3 are equally spaced on a straight line. Thus $\rho_{12} = \rho_{23} = \rho$ and
$\rho_{13} = \rho^2$. We obtain that the estimator is

$$t'X = \frac{1}{3-\rho} X_1 + \frac{1-\rho}{3-\rho} X_2 + \frac{1}{3-\rho} X_3,$$

which has variance

$$\mathrm{Var}(t'X) = \frac{\sigma^2 (1+\rho)}{3-\rho},$$

and reduction in variance

$$\frac{9(1+\rho)}{(3-\rho)(3+4\rho+2\rho^2)}.$$

If $\rho = .5$ we obtain a reduction factor of roughly .98. Significant
reductions are obtained most readily when spacings are very unequal
and correlations are high.
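The intermediate steps of the three-point case can be checked directly. The working below is a sketch that uses the closed-form inverse of the correlation matrix for equally spaced points and assumes $|\rho| < 1$.

```latex
\begin{aligned}
C &= \begin{pmatrix} 1 & \rho & \rho^2 \\ \rho & 1 & \rho \\ \rho^2 & \rho & 1 \end{pmatrix},
\qquad
C^{-1} = \frac{1}{1-\rho^2}
  \begin{pmatrix} 1 & -\rho & 0 \\ -\rho & 1+\rho^2 & -\rho \\ 0 & -\rho & 1 \end{pmatrix}, \\
C^{-1}\mathbf{1} &= \frac{1}{1+\rho}\,(1,\ 1-\rho,\ 1)',
\qquad
\mathbf{1}'C^{-1}\mathbf{1} = \frac{3-\rho}{1+\rho},
\qquad
\mathbf{1}'C\,\mathbf{1} = 3 + 4\rho + 2\rho^2, \\
\rho = .5:&\quad
\frac{9(1.5)}{(2.5)(5.5)} = \frac{13.5}{13.75} \approx .982 .
\end{aligned}
```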
4. FITTING THE MODEL


Ideally one would have sound theoretical reasons for choosing K in
the correlation model. Lacking that, it is possible to estimate K
from large samples from populations that are believed to have K
similar to the target population.

If we know the distance $\Delta_{ij}$ between the sites of two sample points
i and j, then by standard sampling considerations



the residual variance associated with one of the sample points 
knowing the other. But 

2 



From the triples $(X_i, X_j, \Delta_{ij})$ we can estimate K (and $\sigma^2$) by nonlinear
least-squares techniques.
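One way such a fit could be set up is sketched below. The regression equation used in the original fit is not recoverable here, so the model is an assumption: under equal means, equal variances, and correlation $e^{-K\Delta_{ij}}$, the expected squared difference is $E(X_i - X_j)^2 = 2\sigma^2(1 - e^{-K\Delta_{ij}})$, and that curve is fitted to the squared differences. The function name and its parameters are hypothetical, and this is not a reproduction of the SAS NLIN procedure.

```python
import math


def fit_exponential_decay(distances, sq_diffs, k_lo=1e-4, k_hi=5.0,
                          grid=200, rounds=6):
    """Least-squares fit of sq_diff ~ 2 * sigma2 * (1 - exp(-K * distance)).

    Assumed model, not taken from the text. For a fixed K the model is
    linear in sigma2, so sigma2 has a closed form; K is then located by a
    repeatedly refined one-dimensional grid search on [k_lo, k_hi].
    Returns the pair (K, sigma2).
    """
    def profile(k):
        # g_i = 2 * (1 - exp(-k * d_i)); best sigma2 is sum(y*g) / sum(g*g)
        g = [2.0 * (1.0 - math.exp(-k * d)) for d in distances]
        gg = sum(v * v for v in g)
        sigma2 = sum(y * v for y, v in zip(sq_diffs, g)) / gg
        sse = sum((y - sigma2 * v) ** 2 for y, v in zip(sq_diffs, g))
        return sse, sigma2

    lo, hi = k_lo, k_hi
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / grid
        candidates = [lo + i * step for i in range(grid + 1)]
        best = min(candidates, key=lambda k: profile(k)[0])
        # Shrink the search interval to the neighbours of the best grid point.
        lo, hi = max(k_lo, best - step), min(k_hi, best + step)
    return best, profile(best)[1]
```

The grid search keeps K strictly positive by construction, so it cannot by itself reveal whether the data support a positive K; a poor fit would show up as a large residual sum of squares or an estimate pinned at a boundary of the search interval.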

An example (illustrating more than anything else the pitfalls of
fitting the model) is shown in Table 1. Corn production figures for
33 approximately one-square-mile sections in Missouri were obtained,
and the distance between each pair was calculated. The distances are
shown plotted against the absolute value of the difference in the
percentage of acreage planted in corn in each section. The hoped-for
trend is not apparent to the eye; one would expect an increase in
mean difference from lower left to upper right. The equation on
page 5 was fitted to this data using the Statistical Analysis System
(see SAS (1979)) procedure NLIN. The best fit was K = .17,
which would suggest that the correlation drops to 1/2 at a distance
of four miles. Since this is a very short distance in this population,
the sample is close enough to an independent random sample for all
practical purposes, and the sample mean is quite adequate. Furthermore,
the model did not fit well enough to reassure us even of the
positivity of K, which is necessary in our model. Thus, the method
fails to be helpful for this small data set.

Table 1. Differences in Corn Production vs. Geographical Distance
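The half-distance quoted for the fitted value K = .17 follows from the exponential correlation model, taking distances in miles:

```latex
e^{-K\Delta} = \tfrac{1}{2}
\;\Longrightarrow\;
\Delta = \frac{\ln 2}{K} = \frac{0.693}{0.17} \approx 4.1 \text{ miles}.
```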
5. CONCLUSION


Further work needs to be done to test the efficacy of this approach
when high geographical correlations are present and K can be
estimated from either theoretical considerations or extensive previous
experience.